Final Report for the IPSC Exploratory Research Project Cross-lingual Indexing (4/2001 – 3/2003)
Abstract
Cross-lingual information access: providing content descriptors in one language for texts written in another, by assigning Eurovoc thesaurus descriptors automatically.
Similar resources
Event-Coreference across Multiple, Multi-lingual Sources in the Mumis Project
We present our work on information extraction from multiple, multi-lingual sources for the Multimedia Indexing and Searching Environment (MUMIS), a project aiming at developing technology to produce formal annotations about essential events in multimedia programme material. The novelty of our approach consists in the use of a merging or cross-document coreference algorithm that aims at combinin...
NTCIR-3 Chinese, Cross Language Retrieval Experiments Using PIRCS
We participated in the monolingual Chinese, English-Chinese cross language and multilingual retrieval tasks using our PIRCS retrieval system. For monolingual, bigram and short-word indexing (both with single characters) were employed for representation. Two separate retrieval lists were obtained and later combined as final result for some submissions. For cross-lingual and multilingual retrieva...
English-Japanese Cross-lingual Query Expansion Using Random Indexing of Aligned Bilingual Text Data
Vector space models can be used for extracting semantically similar words from the co-occurrence statistics of words in large text data. In this paper, we report on our NTCIR 2002 experiments using the Random Indexing vector space method for extracting an English-Japanese cross-lingual thesaurus from aligned English-Japanese bilingual data. The cross-lingual thesaurus has been used for automatic...
The NTCIR Workshop: the First Evaluation Workshop on Japanese Text Retrieval and Cross-Lingual Information Retrieval
This paper introduces the outline of the first NTCIR Workshop, which is the first evaluation workshop designed to enhance research in Japanese text retrieval and cross-lingual information retrieval. The test collection used in the Workshop consists of more than 330,000 documents, more than half of which are English-Japanese paired. Twenty-three groups from four countries have conducted IR tasks and s...
Report on CLEF-2003 Experiments: Two Ways of Extracting Multilingual Resources from Corpora
We present in this report two main approaches to cross-language information retrieval based on the exploitation of multilingual corpora to derive cross-lingual term-term correspondences. These two approaches are evaluated in the framework of the multilingual-4 (ML4) task.